A linear mapping $L:\mathbb{R}^n\to\mathbb{R}^n$ is sometimes called a linear operator. This is often done when we wish to stress the fact that the domain and codomain of the linear mapping are the same.
Let $B=\{v_1,\dots,v_n\}$ be a basis for $\mathbb{R}^n$ and let $L:\mathbb{R}^n\to\mathbb{R}^n$ be a linear operator. The $B$-matrix of $L$ is defined to be
$$[L]_B=\begin{bmatrix}[L(v_1)]_B & \cdots & [L(v_n)]_B\end{bmatrix}$$
It satisfies $[L(x)]_B=[L]_B[x]_B$.
e.g. Let $L$ be a linear mapping with standard matrix $[L]=\begin{bmatrix}3&-1\\-1&3\end{bmatrix}$ and $B=\left\{\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}1\\-1\end{bmatrix}\right\}$. Find the $B$-matrix of $L$ and use it to determine $[L(x)]_B$ where $[x]_B=\begin{bmatrix}1\\1\end{bmatrix}$.
We have
$$[L]_B=\begin{bmatrix}\left[L\begin{pmatrix}1\\1\end{pmatrix}\right]_B & \left[L\begin{pmatrix}1\\-1\end{pmatrix}\right]_B\end{bmatrix},\qquad L\begin{pmatrix}1\\1\end{pmatrix}=\begin{bmatrix}3&-1\\-1&3\end{bmatrix}\begin{bmatrix}1\\1\end{bmatrix}=\begin{bmatrix}2\\2\end{bmatrix}.$$
To find $\left[\begin{smallmatrix}2\\2\end{smallmatrix}\right]_B$, we solve $c_1\begin{bmatrix}1\\1\end{bmatrix}+c_2\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}2\\2\end{bmatrix}$:
$$c_1+c_2=2,\qquad c_1-c_2=2.$$
Adding the equations: $2c_1=4\Rightarrow c_1=2$. Then $2+c_2=2\Rightarrow c_2=0$. So $\left[L\begin{pmatrix}1\\1\end{pmatrix}\right]_B=\begin{bmatrix}2\\0\end{bmatrix}$.
Next, $L\begin{pmatrix}1\\-1\end{pmatrix}=\begin{bmatrix}3&-1\\-1&3\end{bmatrix}\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}4\\-4\end{bmatrix}$.
To find $\left[\begin{smallmatrix}4\\-4\end{smallmatrix}\right]_B$, we solve $d_1\begin{bmatrix}1\\1\end{bmatrix}+d_2\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}4\\-4\end{bmatrix}$:
$$d_1+d_2=4,\qquad d_1-d_2=-4.$$
Adding the equations: $2d_1=0\Rightarrow d_1=0$. Then $0+d_2=4\Rightarrow d_2=4$. So $\left[L\begin{pmatrix}1\\-1\end{pmatrix}\right]_B=\begin{bmatrix}0\\4\end{bmatrix}$.
So, $[L]_B=\begin{bmatrix}2&0\\0&4\end{bmatrix}$.
Then $[L(x)]_B=[L]_B[x]_B=\begin{bmatrix}2&0\\0&4\end{bmatrix}\begin{bmatrix}1\\1\end{bmatrix}=\begin{bmatrix}2\\4\end{bmatrix}$.
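As a quick numerical check (a numpy sketch; the variable names are mine), the $B$-matrix can equivalently be computed as $P^{-1}[L]P$, where the columns of $P$ are the basis vectors in $B$ — the change-of-basis relation used later in these notes:

```python
import numpy as np

# B-matrix via change of basis: [L]_B = P^{-1} [L] P,
# where the columns of P are the basis vectors in B.
L_std = np.array([[3.0, -1.0],
                  [-1.0, 3.0]])        # standard matrix [L]
P = np.array([[1.0, 1.0],
              [1.0, -1.0]])            # columns: the basis vectors of B
L_B = np.linalg.inv(P) @ L_std @ P     # the B-matrix of L

x_B = np.array([1.0, 1.0])             # coordinates [x]_B
Lx_B = L_B @ x_B                       # coordinates [L(x)]_B

assert np.allclose(L_B, [[2.0, 0.0], [0.0, 4.0]])
assert np.allclose(Lx_B, [2.0, 4.0])
```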
e.g. Let $L:\mathbb{R}^2\to\mathbb{R}^2$ be the linear mapping defined by $L(x_1,x_2)=(2x_1+3x_2,\,4x_1-x_2)$ and $B=\left\{\begin{bmatrix}1\\2\end{bmatrix},\begin{bmatrix}1\\-1\end{bmatrix}\right\}$. Find the $B$-matrix of $L$ and use it to determine $[L(x)]_B$ where $[x]_B=\begin{bmatrix}1\\1\end{bmatrix}$.
The standard matrix is $[L]=\begin{bmatrix}2&3\\4&-1\end{bmatrix}$.
First, $L\begin{pmatrix}1\\2\end{pmatrix}=\begin{bmatrix}2&3\\4&-1\end{bmatrix}\begin{bmatrix}1\\2\end{bmatrix}=\begin{bmatrix}2+6\\4-2\end{bmatrix}=\begin{bmatrix}8\\2\end{bmatrix}$.
To find $\left[\begin{smallmatrix}8\\2\end{smallmatrix}\right]_B$, we solve $c_1\begin{bmatrix}1\\2\end{bmatrix}+c_2\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}8\\2\end{bmatrix}$:
$$c_1+c_2=8,\qquad 2c_1-c_2=2.$$
Adding: $3c_1=10\Rightarrow c_1=10/3$. Then $10/3+c_2=8\Rightarrow c_2=24/3-10/3=14/3$. So $\left[L\begin{pmatrix}1\\2\end{pmatrix}\right]_B=\begin{bmatrix}10/3\\14/3\end{bmatrix}$.
Next, $L\begin{pmatrix}1\\-1\end{pmatrix}=\begin{bmatrix}2&3\\4&-1\end{bmatrix}\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}2-3\\4+1\end{bmatrix}=\begin{bmatrix}-1\\5\end{bmatrix}$.
To find $\left[\begin{smallmatrix}-1\\5\end{smallmatrix}\right]_B$, we solve $d_1\begin{bmatrix}1\\2\end{bmatrix}+d_2\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}-1\\5\end{bmatrix}$:
$$d_1+d_2=-1,\qquad 2d_1-d_2=5.$$
Adding: $3d_1=4\Rightarrow d_1=4/3$. Then $4/3+d_2=-1\Rightarrow d_2=-3/3-4/3=-7/3$. So $\left[L\begin{pmatrix}1\\-1\end{pmatrix}\right]_B=\begin{bmatrix}4/3\\-7/3\end{bmatrix}$.
Then $[L]_B=\begin{bmatrix}10/3&4/3\\14/3&-7/3\end{bmatrix}$ and
$$[L(x)]_B=[L]_B[x]_B=\begin{bmatrix}10/3&4/3\\14/3&-7/3\end{bmatrix}\begin{bmatrix}1\\1\end{bmatrix}=\begin{bmatrix}10/3+4/3\\14/3-7/3\end{bmatrix}=\begin{bmatrix}14/3\\7/3\end{bmatrix}.$$
An $n\times n$ matrix $D$ is said to be a diagonal matrix if $d_{ij}=0$ for all $i\neq j$. We denote a diagonal matrix by $\operatorname{diag}(d_{11},d_{22},\dots,d_{nn})$.
The matrix $\begin{bmatrix}2&0\\0&4\end{bmatrix}$ is a diagonal matrix.
Theorem 6.1.7
If $A$ and $B$ are $n\times n$ matrices such that $P^{-1}AP=B$ for some invertible matrix $P$, then
$\operatorname{rank}A=\operatorname{rank}B$.
$\det A=\det B$.
$\operatorname{tr}A=\operatorname{tr}B$, where $\operatorname{tr}A$ is defined by $\operatorname{tr}A=\sum_{i=1}^{n}a_{ii}$ and is called the trace of the matrix.
If $A$ and $B$ are $n\times n$ matrices such that $P^{-1}AP=B$ for some invertible matrix $P$, then $A$ is said to be similar to $B$.
If $A$ is similar to $B$, prove that $A^n$ is similar to $B^n$.
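Theorem 6.1.7 and the exercise above can be illustrated numerically (a numpy sketch with a matrix $A$ and invertible $P$ of my own choosing):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 1.0]])
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])        # invertible: det P = 2
B = np.linalg.inv(P) @ A @ P           # B is similar to A

# Theorem 6.1.7: similar matrices share rank, determinant, and trace.
assert np.linalg.matrix_rank(B) == np.linalg.matrix_rank(A)
assert np.isclose(np.linalg.det(B), np.linalg.det(A))
assert np.isclose(np.trace(B), np.trace(A))

# Exercise: B^n = (P^{-1} A P)^n = P^{-1} A^n P, so A^n is similar to B^n.
n = 4
assert np.allclose(np.linalg.matrix_power(B, n),
                   np.linalg.inv(P) @ np.linalg.matrix_power(A, n) @ P)
```

The last assertion is exactly the proof idea for the exercise: the interior $PP^{-1}$ factors cancel when $(P^{-1}AP)^n$ is expanded.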
6.2 Eigenvalues and Eigenvectors
Let $A$ be an $n\times n$ matrix. If there exists a vector $v\neq 0$ such that $Av=\lambda v$, then the scalar $\lambda$ is called an eigenvalue of $A$ and $v$ is called an eigenvector of $A$ corresponding to $\lambda$. The pair $(\lambda,v)$ is called an eigenpair.
Let $L:\mathbb{R}^n\to\mathbb{R}^n$ be a linear operator. If there exists a vector $v\neq 0$ such that $L(v)=\lambda v$, then $\lambda$ is called an eigenvalue of $L$ and $v$ is called an eigenvector of $L$ corresponding to $\lambda$.
Why do you think we have the $v\neq 0$ condition? (If $v=0$ were allowed, then $A0=\lambda 0$ reduces to $0=0$ for every scalar $\lambda$, so every scalar would trivially be an eigenvalue.)
e.g. Consider again the linear mapping $L:\mathbb{R}^2\to\mathbb{R}^2$ with standard matrix $[L]=\begin{bmatrix}3&-1\\-1&3\end{bmatrix}$. As we saw:
$$L(1,1)=[L]\begin{bmatrix}1\\1\end{bmatrix}=\begin{bmatrix}3&-1\\-1&3\end{bmatrix}\begin{bmatrix}1\\1\end{bmatrix}=\begin{bmatrix}2\\2\end{bmatrix}=2\begin{bmatrix}1\\1\end{bmatrix},\qquad L(1,-1)=[L]\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}3&-1\\-1&3\end{bmatrix}\begin{bmatrix}1\\-1\end{bmatrix}=\begin{bmatrix}4\\-4\end{bmatrix}=4\begin{bmatrix}1\\-1\end{bmatrix}.$$
Thus, $\left(2,\begin{bmatrix}1\\1\end{bmatrix}\right)$ and $\left(4,\begin{bmatrix}1\\-1\end{bmatrix}\right)$ are eigenpairs of $[L]$.
Also, $2$ is an eigenvalue of $L$ with eigenvector $\begin{bmatrix}1\\1\end{bmatrix}$, while $4$ is another eigenvalue with corresponding eigenvector $\begin{bmatrix}1\\-1\end{bmatrix}$.
e.g. Determine which of the following vectors are eigenvectors of $A=\begin{bmatrix}2&1&1\\-3&-2&-3\\-1&-1&0\end{bmatrix}$.
(a) $v_1=\begin{bmatrix}1\\1\\1\end{bmatrix}$. $Av_1=\begin{bmatrix}2+1+1\\-3-2-3\\-1-1+0\end{bmatrix}=\begin{bmatrix}4\\-8\\-2\end{bmatrix}$. This is not a scalar multiple of $v_1$. No.
(b) $v_2=\begin{bmatrix}-1\\3\\1\end{bmatrix}$. $Av_2=\begin{bmatrix}2(-1)+1(3)+1(1)\\-3(-1)+(-2)(3)+(-3)(1)\\-1(-1)+(-1)(3)+0(1)\end{bmatrix}=\begin{bmatrix}-2+3+1\\3-6-3\\1-3+0\end{bmatrix}=\begin{bmatrix}2\\-6\\-2\end{bmatrix}$.
This is $(-2)\begin{bmatrix}-1\\3\\1\end{bmatrix}=(-2)v_2$. So $v_2$ is an eigenvector with eigenvalue $\lambda=-2$. Yes.
(c) $v_3=\begin{bmatrix}1\\0\\-1\end{bmatrix}$. $Av_3=\begin{bmatrix}2(1)+1(0)+1(-1)\\-3(1)+(-2)(0)+(-3)(-1)\\-1(1)+(-1)(0)+0(-1)\end{bmatrix}=\begin{bmatrix}2-1\\-3+3\\-1\end{bmatrix}=\begin{bmatrix}1\\0\\-1\end{bmatrix}$.
This is $(1)\begin{bmatrix}1\\0\\-1\end{bmatrix}=(1)v_3$. So $v_3$ is an eigenvector with eigenvalue $\lambda=1$. Yes.
(d) If $(\lambda,v)$ is an eigenpair of $A$, then is $(2\lambda,2v)$ another eigenpair of $A$? No.
$A(2v)=2(Av)=2(\lambda v)=\lambda(2v)$, so $(\lambda,2v)$ is an eigenpair. For $(2\lambda,2v)$ to be an eigenpair, we would need $A(2v)=(2\lambda)(2v)=4\lambda v$. This means $2\lambda v=4\lambda v$, which implies $2\lambda v=0$. Since $v\neq 0$, this forces $\lambda=0$. So $(2\lambda,2v)$ is an eigenpair only when $\lambda=0$.
e.g. Can you imagine a scenario where $(\lambda,v_1)$ and $(\lambda,v_2)$ are eigenpairs of $A$, with $v_1\neq v_2$? Yes. For example, $A=\begin{bmatrix}3&0\\0&3\end{bmatrix}$, $\lambda=3$, $v_1=\begin{bmatrix}1\\0\end{bmatrix}$ and $v_2=\begin{bmatrix}0\\1\end{bmatrix}$.
From the definition, an eigenpair $(\lambda,v)$ of $A$ requires $v\neq 0$ with $Av=\lambda v$. Equivalently,
$$Av-\lambda v=0\quad\Rightarrow\quad Av-\lambda Iv=0\quad\Rightarrow\quad (A-\lambda I)v=0,$$
so $v$ is a solution to the homogeneous system $[\,A-\lambda I\mid 0\,]$.
Since we need $v\neq 0$, the eigenvalue $\lambda$ exists if and only if this system has infinitely many solutions (i.e., non-trivial solutions).
In turn, this means $A-\lambda I$ is not invertible, or, $\det(A-\lambda I)=0$.
Once such a $\lambda$ has been found, we can determine its associated eigenvectors by solving the homogeneous system $(A-\lambda I)v=0$.
Find all the eigenvalues of $A=\begin{bmatrix}0&1\\1&0\end{bmatrix}$. Determine all eigenvectors associated with each eigenvalue.
$$\det(A-\lambda I)=\begin{vmatrix}-\lambda&1\\1&-\lambda\end{vmatrix}=(-\lambda)(-\lambda)-(1)(1)=\lambda^2-1.$$
Set $\lambda^2-1=0\Rightarrow(\lambda-1)(\lambda+1)=0$.
So, $\lambda_1=1$, $\lambda_2=-1$.
For $\lambda_1=1$: $(A-I)v=0$ with $A-I=\begin{bmatrix}-1&1\\1&-1\end{bmatrix}$, whose RREF is $\begin{bmatrix}1&-1\\0&0\end{bmatrix}$.
So $x_1-x_2=0\Rightarrow x_1=x_2$. Let $x_2=s$. Then $v=s\begin{bmatrix}1\\1\end{bmatrix}$. Eigenvectors are $s\begin{bmatrix}1\\1\end{bmatrix}$ for $s\neq 0$.
For $\lambda_2=-1$: $(A-(-1)I)v=(A+I)v=0$ with $A+I=\begin{bmatrix}1&1\\1&1\end{bmatrix}$, whose RREF is $\begin{bmatrix}1&1\\0&0\end{bmatrix}$.
So $x_1+x_2=0\Rightarrow x_1=-x_2$. Let $x_2=s$. Then $v=s\begin{bmatrix}-1\\1\end{bmatrix}$. Eigenvectors are $s\begin{bmatrix}-1\\1\end{bmatrix}$ for $s\neq 0$.
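The same eigenpairs can be recovered numerically with `np.linalg.eig` (a minimal numpy sketch; names are mine). Note that numpy returns unit-length eigenvectors, i.e. particular choices of the scalar $s$ above:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
evals, evecs = np.linalg.eig(A)   # eigenvectors are the columns of evecs

# Each returned pair satisfies A v = lambda v:
for lam, v in zip(evals, evecs.T):
    assert np.allclose(A @ v, lam * v)

print(np.sort(evals))   # [-1.  1.]
```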
Let $A$ be an $n\times n$ matrix. The characteristic polynomial of $A$ is the $n$-th degree polynomial $C_A(\lambda)=\det(A-\lambda I)$.
If there is no risk of confusion, we will sometimes write $C(\lambda)$ instead of $C_A(\lambda)$.
Theorem 6.2.8
A scalar $\lambda$ is an eigenvalue of an $n\times n$ matrix $A$ if and only if $C_A(\lambda)=0$.
Let $A$ be an $n\times n$ matrix with eigenvalue $\lambda$. We call the nullspace of $A-\lambda I$ the eigenspace of $A$ corresponding to $\lambda$. The eigenspace is denoted $E_\lambda$.
e.g. Find all eigenvalues and a basis for each eigenspace for $A=\begin{bmatrix}1&2&1\\0&1&2\\0&0&-2\end{bmatrix}$.
$$C_A(\lambda)=\det(A-\lambda I)=\begin{vmatrix}1-\lambda&2&1\\0&1-\lambda&2\\0&0&-2-\lambda\end{vmatrix}.$$
Since $A$ is upper triangular, the determinant is the product of the diagonal entries:
$$C_A(\lambda)=(1-\lambda)(1-\lambda)(-2-\lambda)=(1-\lambda)^2(-2-\lambda).$$
Set $C_A(\lambda)=0\Rightarrow(1-\lambda)^2(-2-\lambda)=0$.
Eigenvalues are $\lambda_1=1$ (with algebraic multiplicity $2$) and $\lambda_2=-2$ (with algebraic multiplicity $1$).
For $\lambda_1=1$: solve $(A-I)v=0$ where $A-I=\begin{bmatrix}0&2&1\\0&0&2\\0&0&-3\end{bmatrix}$.
Row-reducing gives the RREF $\begin{bmatrix}0&1&0\\0&0&1\\0&0&0\end{bmatrix}$.
So $x_2=0$, $x_3=0$, and $x_1$ is free. Let $x_1=t$.
Eigenvectors: $v=t\begin{bmatrix}1\\0\\0\end{bmatrix}$. A basis for $E_{\lambda_1}$ is $\left\{\begin{bmatrix}1\\0\\0\end{bmatrix}\right\}$.
For $\lambda_2=-2$: solve $(A-(-2)I)v=(A+2I)v=0$ where
$$A+2I=\begin{bmatrix}1+2&2&1\\0&1+2&2\\0&0&-2+2\end{bmatrix}=\begin{bmatrix}3&2&1\\0&3&2\\0&0&0\end{bmatrix}.$$
Row-reducing gives the RREF $\begin{bmatrix}1&0&-1/9\\0&1&2/3\\0&0&0\end{bmatrix}$.
So $x_1-\frac{1}{9}x_3=0\Rightarrow x_1=\frac{1}{9}x_3$ and $x_2+\frac{2}{3}x_3=0\Rightarrow x_2=-\frac{2}{3}x_3$. Let $x_3=t$.
Eigenvectors: $v=t\begin{bmatrix}1/9\\-2/3\\1\end{bmatrix}$. A basis for $E_{\lambda_2}$ is $\left\{\begin{bmatrix}1/9\\-2/3\\1\end{bmatrix}\right\}$.
Theorem 6.2.13
If A is an n×n upper or lower triangular matrix, then the eigenvalues of A are the diagonal entries of A.
Let $A$ be an $n\times n$ matrix with eigenvalue $\lambda_1$. The algebraic multiplicity of $\lambda_1$, denoted $a_{\lambda_1}$, is the number of times that $\lambda_1$ is a root of the characteristic polynomial $C(\lambda)$. That is, if $C(\lambda)=(\lambda-\lambda_1)^k C_1(\lambda)$, where $C_1(\lambda_1)\neq 0$, then $a_{\lambda_1}=k$. The geometric multiplicity of $\lambda_1$, denoted $g_{\lambda_1}$, is the dimension of its eigenspace. So, $g_{\lambda_1}=\dim(E_{\lambda_1})$.
e.g. For the matrix $A=\begin{bmatrix}1&2&1\\0&1&2\\0&0&-2\end{bmatrix}$, which is upper triangular, we have $\lambda_1=1$ and $\lambda_2=-2$, with algebraic multiplicities $a_{\lambda_1}=2$ and $a_{\lambda_2}=1$.
Since $E_{\lambda_1}=\operatorname{Span}\left\{\begin{bmatrix}1\\0\\0\end{bmatrix}\right\}$ and $E_{\lambda_2}=\operatorname{Span}\left\{\begin{bmatrix}1/9\\-2/3\\1\end{bmatrix}\right\}$, the geometric multiplicities are $g_{\lambda_1}=\dim E_{\lambda_1}=1$ and $g_{\lambda_2}=\dim E_{\lambda_2}=1$.
e.g. Find the geometric and algebraic multiplicity of all eigenvalues of $A=\begin{bmatrix}-1&6&3\\1&0&-1\\-3&6&5\end{bmatrix}$.
$$C_A(\lambda)=\det(A-\lambda I)=\begin{vmatrix}-1-\lambda&6&3\\1&-\lambda&-1\\-3&6&5-\lambda\end{vmatrix}.$$
Expanding along the first column:
$$C_A(\lambda)=(-1-\lambda)\begin{vmatrix}-\lambda&-1\\6&5-\lambda\end{vmatrix}-1\begin{vmatrix}6&3\\6&5-\lambda\end{vmatrix}+(-3)\begin{vmatrix}6&3\\-\lambda&-1\end{vmatrix}$$
$$=(-1-\lambda)(\lambda^2-5\lambda+6)-(30-6\lambda-18)-3(-6+3\lambda)$$
$$=(-1-\lambda)(\lambda-2)(\lambda-3)-(12-6\lambda)+18-9\lambda$$
$$=(-1-\lambda)(\lambda-2)(\lambda-3)-3\lambda+6=(-1-\lambda)(\lambda-2)(\lambda-3)-3(\lambda-2)$$
$$=-\lambda^3+4\lambda^2-4\lambda=-\lambda(\lambda^2-4\lambda+4)=-\lambda(\lambda-2)^2.$$
So the eigenvalues are $\lambda_1=0$ (algebraic multiplicity $a_{\lambda_1}=1$) and $\lambda_2=2$ (algebraic multiplicity $a_{\lambda_2}=2$).
For $\lambda_1=0$: solve $(A-0I)v=Av=0$.
Row-reducing $A$ gives the RREF $\begin{bmatrix}1&0&-1\\0&1&1/3\\0&0&0\end{bmatrix}$, so $x_1-x_3=0\Rightarrow x_1=x_3$
and $x_2+\frac{1}{3}x_3=0\Rightarrow x_2=-\frac{1}{3}x_3$. Let $x_3=t$.
Then $v=t\begin{bmatrix}1\\-1/3\\1\end{bmatrix}$ and $E_{\lambda_1}=\operatorname{Span}\left\{\begin{bmatrix}1\\-1/3\\1\end{bmatrix}\right\}$. So $g_{\lambda_1}=1$.
For $\lambda_2=2$: solve $(A-2I)v=0$ where
$$A-2I=\begin{bmatrix}-1-2&6&3\\1&-2&-1\\-3&6&5-2\end{bmatrix}=\begin{bmatrix}-3&6&3\\1&-2&-1\\-3&6&3\end{bmatrix}.$$
Row-reducing gives the RREF $\begin{bmatrix}1&-2&-1\\0&0&0\\0&0&0\end{bmatrix}$.
So $x_1-2x_2-x_3=0\Rightarrow x_1=2x_2+x_3$. Let $x_2=s$, $x_3=t$. Then
$$v=\begin{bmatrix}2s+t\\s\\t\end{bmatrix}=s\begin{bmatrix}2\\1\\0\end{bmatrix}+t\begin{bmatrix}1\\0\\1\end{bmatrix}.$$
$E_{\lambda_2}=\operatorname{Span}\left\{\begin{bmatrix}2\\1\\0\end{bmatrix},\begin{bmatrix}1\\0\\1\end{bmatrix}\right\}$. So $g_{\lambda_2}=2$.
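Since $g_\lambda=\dim\operatorname{Null}(A-\lambda I)=n-\operatorname{rank}(A-\lambda I)$, the geometric multiplicities above can be checked numerically (a numpy sketch; the dictionary `g` is my own bookkeeping):

```python
import numpy as np

A = np.array([[-1.0, 6.0, 3.0],
              [ 1.0, 0.0, -1.0],
              [-3.0, 6.0, 5.0]])
n = A.shape[0]

# g_lambda = dim Null(A - lambda*I) = n - rank(A - lambda*I)
g = {lam: n - np.linalg.matrix_rank(A - lam * np.eye(n)) for lam in (0.0, 2.0)}
assert g[0.0] == 1 and g[2.0] == 2
```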
Lemma 6.2.20
If $A$ and $B$ are similar matrices, then $A$ and $B$ have the same characteristic polynomial, and hence the same eigenvalues.
Theorem 6.2.21
If $A$ is an $n\times n$ matrix with eigenvalue $\lambda_1$, then $1\leq g_{\lambda_1}\leq a_{\lambda_1}$.
6.3 Diagonalization
An $n\times n$ matrix $A\in M_{n\times n}(\mathbb{R})$ is said to be diagonalizable if $A$ is similar to a diagonal matrix $D\in M_{n\times n}(\mathbb{R})$. If $P^{-1}AP=D$, then we say that $P$ diagonalizes $A$.
Remark: For now, we will restrict ourselves to diagonalizing real matrices with real eigenvalues. That is, if $A$ has a non-real eigenvalue, then we will say that $A$ is not diagonalizable over $\mathbb{R}$. In Section 6.5, we will look at diagonalizing matrices over the complex numbers.
Theorem 6.3.2
An $n\times n$ matrix $A$ is diagonalizable (over $\mathbb{R}$) if and only if there exists a basis $\{v_1,\dots,v_n\}$ for $\mathbb{R}^n$ of eigenvectors of $A$.
e.g. Consider the mapping $L:\mathbb{R}^2\to\mathbb{R}^2$ that reflects vectors across the line $y=x$.
Its standard matrix is $[L]=\begin{bmatrix}0&1\\1&0\end{bmatrix}$, as explored in a previous example, where we found the eigenpairs $(\lambda_1,v_1)$ and $(\lambda_2,v_2)$, with $\lambda_1=1$, $\lambda_2=-1$ and $v_1=\begin{bmatrix}1\\1\end{bmatrix}$, $v_2=\begin{bmatrix}1\\-1\end{bmatrix}$.
Since we have two linearly independent eigenvectors, we can form a basis of eigenvectors $B=\{v_1,v_2\}$.
What, then, is the $B$-matrix of $L$?
Well, $L(v_1)=\lambda_1 v_1$ and $L(v_2)=\lambda_2 v_2$, so
$$[L(v_1)]_B=\begin{bmatrix}\lambda_1\\0\end{bmatrix}\ \text{and}\ [L(v_2)]_B=\begin{bmatrix}0\\\lambda_2\end{bmatrix}\;\Rightarrow\;[L]_B=\begin{bmatrix}\lambda_1&0\\0&\lambda_2\end{bmatrix}=\begin{bmatrix}1&0\\0&-1\end{bmatrix}.$$
Let $S$ be the standard basis. We have $[L]_S={}_S P_B\,[L]_B\,{}_B P_S$.
Here ${}_S P_B=P=\begin{bmatrix}v_1&v_2\end{bmatrix}=\begin{bmatrix}1&1\\1&-1\end{bmatrix}$.
Then ${}_B P_S=P^{-1}=\frac{1}{1(-1)-1(1)}\begin{bmatrix}-1&-1\\-1&1\end{bmatrix}=-\frac{1}{2}\begin{bmatrix}-1&-1\\-1&1\end{bmatrix}=\begin{bmatrix}1/2&1/2\\1/2&-1/2\end{bmatrix}$.
So $\begin{bmatrix}0&1\\1&0\end{bmatrix}=\begin{bmatrix}1&1\\1&-1\end{bmatrix}\begin{bmatrix}1&0\\0&-1\end{bmatrix}\begin{bmatrix}1/2&1/2\\1/2&-1/2\end{bmatrix}$,
and $[L]$ is diagonalizable.
Lemma 6.3.3
If $A$ is an $n\times n$ matrix with eigenpairs $(\lambda_1,v_1),(\lambda_2,v_2),\dots,(\lambda_k,v_k)$ where $\lambda_i\neq\lambda_j$ for $i\neq j$, then $\{v_1,\dots,v_k\}$ is linearly independent.
Theorem 6.3.4
If $A$ is an $n\times n$ matrix with distinct eigenvalues $\lambda_1,\dots,\lambda_k$ and $B_i=\{v_{i,1},\dots,v_{i,g_{\lambda_i}}\}$ is a basis for the eigenspace of $\lambda_i$ for $1\leq i\leq k$, then $B_1\cup B_2\cup\dots\cup B_k$ is a linearly independent set.
Diagonalizability Test
If $A$ is an $n\times n$ matrix whose characteristic polynomial factors as $C_A(\lambda)=(\lambda-\lambda_1)^{a_{\lambda_1}}\cdots(\lambda-\lambda_k)^{a_{\lambda_k}}$ where $\lambda_1,\dots,\lambda_k$ are the distinct eigenvalues of $A$, then $A$ is diagonalizable if and only if $g_{\lambda_i}=a_{\lambda_i}$ for $1\leq i\leq k$.
Corollary 6.3.6
If A is an n×n matrix with n distinct eigenvalues, then A is diagonalizable.
Algorithm
To diagonalize an $n\times n$ matrix $A$, or show that $A$ is not diagonalizable:
Find and factor the characteristic polynomial $C(\lambda)=\det(A-\lambda I)$.
Let $\lambda_1,\dots,\lambda_n$ denote the $n$ roots of $C(\lambda)$ (repeated according to multiplicity).
If any of the eigenvalues $\lambda_i$ are not real, then $A$ is not diagonalizable over $\mathbb{R}$.
Find a basis for the eigenspace of each distinct eigenvalue $\lambda_j$ by finding a basis for the nullspace of $A-\lambda_j I$.
If $g_{\lambda_j}<a_{\lambda_j}$ for any $\lambda_j$, then $A$ is not diagonalizable.
Otherwise, form a basis $\{v_1,\dots,v_n\}$ for $\mathbb{R}^n$ of eigenvectors of $A$ by using Theorem 6.3.4. Let $P=\begin{bmatrix}v_1&\cdots&v_n\end{bmatrix}$.
Then, $P^{-1}AP=\operatorname{diag}(\lambda_1,\dots,\lambda_n)$, where $\lambda_i$ is the eigenvalue corresponding to the eigenvector $v_i$ for $1\leq i\leq n$.
e.g. Show that $A=\begin{bmatrix}-1&6&3\\1&0&-1\\-3&6&5\end{bmatrix}$ is diagonalizable and find an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP=D$.
From earlier, $C_A(\lambda)=-\lambda(\lambda-2)^2$.
Eigenvalues: $\lambda_1=0$ ($a_{\lambda_1}=1$) and $\lambda_2=2$ ($a_{\lambda_2}=2$).
For $\lambda_1=0$, $E_{\lambda_1}=\operatorname{Span}\left\{\begin{bmatrix}1\\-1/3\\1\end{bmatrix}\right\}$, so $g_{\lambda_1}=1=a_{\lambda_1}$.
For $\lambda_2=2$, $E_{\lambda_2}=\operatorname{Span}\left\{\begin{bmatrix}2\\1\\0\end{bmatrix},\begin{bmatrix}1\\0\\1\end{bmatrix}\right\}$, so $g_{\lambda_2}=2=a_{\lambda_2}$.
Since the algebraic and geometric multiplicities match for all eigenvalues, $A$ is diagonalizable, with
$$P=\begin{bmatrix}1&2&1\\-1/3&1&0\\1&0&1\end{bmatrix},\qquad D=\begin{bmatrix}0&0&0\\0&2&0\\0&0&2\end{bmatrix}.$$
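A quick numerical check of this diagonalization (a numpy sketch; variable names are mine):

```python
import numpy as np

A = np.array([[-1.0, 6.0, 3.0],
              [ 1.0, 0.0, -1.0],
              [-3.0, 6.0, 5.0]])
P = np.array([[ 1.0,  2.0, 1.0],     # columns: the eigenspace basis vectors
              [-1/3,  1.0, 0.0],
              [ 1.0,  0.0, 1.0]])
D = np.diag([0.0, 2.0, 2.0])

# P diagonalizes A: P^{-1} A P = D
assert np.allclose(np.linalg.inv(P) @ A @ P, D)
```

Equivalently, $AP=PD$ column by column: each column of $P$ is scaled by its eigenvalue.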
e.g. Show that $A=\begin{bmatrix}1&1\\0&1\end{bmatrix}$ is not diagonalizable.
$$C_A(\lambda)=\begin{vmatrix}1-\lambda&1\\0&1-\lambda\end{vmatrix}=(1-\lambda)^2=0.$$
Eigenvalue $\lambda=1$ with $a_\lambda=2$.
For $\lambda=1$: $(A-I)v=0\Rightarrow\begin{bmatrix}0&1\\0&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix}$.
This gives $x_2=0$, with $x_1$ free. Let $x_1=t$. Eigenvectors are $t\begin{bmatrix}1\\0\end{bmatrix}$.
$E_\lambda=\operatorname{Span}\left\{\begin{bmatrix}1\\0\end{bmatrix}\right\}$. So $g_\lambda=1$.
Since $g_\lambda=1<a_\lambda=2$, $A$ is not diagonalizable.
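The deficiency is visible numerically via the rank of $A-I$ (a numpy sketch; names mine):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
# lambda = 1 has algebraic multiplicity 2, but the eigenspace
# E_1 = Null(A - I) is only 1-dimensional:
g = 2 - np.linalg.matrix_rank(A - np.eye(2))
assert g == 1   # g < a = 2, so there is no basis of eigenvectors
```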
e.g. Show that $A=\begin{bmatrix}0&-1\\1&0\end{bmatrix}$ is not diagonalizable over $\mathbb{R}$.
$$C_A(\lambda)=\begin{vmatrix}-\lambda&-1\\1&-\lambda\end{vmatrix}=\lambda^2+1=0.$$
$\lambda^2=-1\Rightarrow\lambda=\pm i$.
Since the eigenvalues are not real numbers, $A$ is not diagonalizable over $\mathbb{R}$.
Over $\mathbb{C}$, however, $P^{-1}AP=\begin{bmatrix}i&0\\0&-i\end{bmatrix}$ for a suitable $P$; this is a complex diagonalization.
Theorem 6.3.13
If $\lambda_1,\dots,\lambda_n$ are all the $n$ eigenvalues of an $n\times n$ matrix $A$ (repeated according to algebraic multiplicity), then
$$\det A=\lambda_1\cdots\lambda_n\quad\text{and}\quad\operatorname{tr}A=\lambda_1+\dots+\lambda_n.$$
e.g. Find all eigenvalues of $A=\begin{bmatrix}1&0&0\\0&0&1\\0&-1&0\end{bmatrix}$ and verify that $\det A=\lambda_1\lambda_2\lambda_3$ and $\operatorname{tr}A=\lambda_1+\lambda_2+\lambda_3$.
$$C_A(\lambda)=\det(A-\lambda I)=\begin{vmatrix}1-\lambda&0&0\\0&-\lambda&1\\0&-1&-\lambda\end{vmatrix}=(1-\lambda)\big((-\lambda)(-\lambda)-(-1)(1)\big)=(1-\lambda)(\lambda^2+1).$$
Eigenvalues are $\lambda_1=1$, $\lambda_2=i$, $\lambda_3=-i$.
$\det A=1(0-(-1))-0+0=1$.
$\lambda_1\lambda_2\lambda_3=1\cdot i\cdot(-i)=-i^2=-(-1)=1$. Verified.
$\operatorname{tr}A=1+0+0=1$.
$\lambda_1+\lambda_2+\lambda_3=1+i+(-i)=1$. Verified.
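Theorem 6.3.13 holds even when the eigenvalues are complex, as this example shows; a numpy sketch of the same check (names mine):

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, -1.0, 0.0]])
evals = np.linalg.eigvals(A)   # 1, i, -i (returned with complex dtype)

# Theorem 6.3.13: det A = product of eigenvalues, tr A = sum of eigenvalues.
assert np.isclose(np.prod(evals), np.linalg.det(A))
assert np.isclose(np.sum(evals), np.trace(A))
```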
6.4 Powers of Matrices
Theorem 6.4.1
Let $A$ be an $n\times n$ matrix. If there exists an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP=D$, then
$$A^k=PD^kP^{-1}.$$
Let $A=\begin{bmatrix}1&2\\-1&4\end{bmatrix}$. Show that
$$A^{1000}=PD^{1000}P^{-1}=\begin{bmatrix}2^{1001}-3^{1000}&-2^{1001}+2\cdot3^{1000}\\2^{1000}-3^{1000}&-2^{1000}+2\cdot3^{1000}\end{bmatrix}.$$
We first find $P$ and $D$ for $A$.
$$C_A(\lambda)=\begin{vmatrix}1-\lambda&2\\-1&4-\lambda\end{vmatrix}=(1-\lambda)(4-\lambda)-(2)(-1)=\lambda^2-5\lambda+6=(\lambda-2)(\lambda-3).$$
Eigenvalues: $\lambda_1=2$, $\lambda_2=3$.
For $\lambda_1=2$: $A-2I=\begin{bmatrix}-1&2\\-1&2\end{bmatrix}\to\begin{bmatrix}1&-2\\0&0\end{bmatrix}$, so $x_1-2x_2=0$ and $v_1=\begin{bmatrix}2\\1\end{bmatrix}$.
For $\lambda_2=3$: $A-3I=\begin{bmatrix}-2&2\\-1&1\end{bmatrix}\to\begin{bmatrix}1&-1\\0&0\end{bmatrix}$, so $x_1-x_2=0$ and $v_2=\begin{bmatrix}1\\1\end{bmatrix}$.
So $P=\begin{bmatrix}2&1\\1&1\end{bmatrix}$ and $D=\begin{bmatrix}2&0\\0&3\end{bmatrix}$, with
$$P^{-1}=\frac{1}{2-1}\begin{bmatrix}1&-1\\-1&2\end{bmatrix}=\begin{bmatrix}1&-1\\-1&2\end{bmatrix},\qquad D^{1000}=\begin{bmatrix}2^{1000}&0\\0&3^{1000}\end{bmatrix}.$$
Therefore
$$A^{1000}=PD^{1000}P^{-1}=\begin{bmatrix}2&1\\1&1\end{bmatrix}\begin{bmatrix}2^{1000}&0\\0&3^{1000}\end{bmatrix}\begin{bmatrix}1&-1\\-1&2\end{bmatrix}=\begin{bmatrix}2^{1001}&3^{1000}\\2^{1000}&3^{1000}\end{bmatrix}\begin{bmatrix}1&-1\\-1&2\end{bmatrix}=\begin{bmatrix}2^{1001}-3^{1000}&-2^{1001}+2\cdot3^{1000}\\2^{1000}-3^{1000}&-2^{1000}+2\cdot3^{1000}\end{bmatrix}.$$
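Since $2^{1000}$ far exceeds floating-point range, an exact check of this formula needs arbitrary-precision integers. A minimal pure-Python sketch (the helper `matmul2` is my own):

```python
# Exact check of A^1000 = P D^1000 P^{-1} using Python's
# arbitrary-precision integers (2^1000 overflows any float type).
def matmul2(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P     = [[2, 1], [1, 1]]
P_inv = [[1, -1], [-1, 2]]                 # det P = 1, so P^{-1} has integer entries
D1000 = [[2**1000, 0], [0, 3**1000]]

A1000 = matmul2(matmul2(P, D1000), P_inv)

# Compare against repeated multiplication by A itself.
A = [[1, 2], [-1, 4]]
R = [[1, 0], [0, 1]]
for _ in range(1000):
    R = matmul2(R, A)
assert R == A1000
```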
6.5 Complex Diagonalization
Consider the standard matrix of the rotation mapping $R_\theta:\mathbb{R}^2\to\mathbb{R}^2$:
$$[R_\theta]=\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}\qquad(0\leq\theta<2\pi)$$
(a) Thinking geometrically, should this matrix have any real eigenvalues? (Generally no, unless $\theta=0$ or $\theta=\pi$: a rotation changes the direction of every vector unless the rotation is by $0$ or $\pi$ radians.)
(b) Can you confirm your answer to (a) by studying the roots of $C(\lambda)$?
$$C(\lambda)=\begin{vmatrix}\cos\theta-\lambda&-\sin\theta\\\sin\theta&\cos\theta-\lambda\end{vmatrix}=(\cos\theta-\lambda)^2-(-\sin\theta)(\sin\theta)=\lambda^2-2\lambda\cos\theta+1=0.$$
The roots are
$$\lambda=\frac{2\cos\theta\pm\sqrt{4\cos^2\theta-4}}{2}=\frac{2\cos\theta\pm\sqrt{-4\sin^2\theta}}{2}=\frac{2\cos\theta\pm 2i\,|\sin\theta|}{2}=\cos\theta\pm i\,|\sin\theta|.$$
These are real if and only if $\sin\theta=0$, which means $\theta=0$ or $\theta=\pi$.
If $\theta=0$: $R_0=\begin{bmatrix}1&0\\0&1\end{bmatrix}$ with eigenvalue $\lambda=1$ and $E_\lambda=\operatorname{Span}\left\{\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\right\}=\mathbb{R}^2$.
If $\theta=\pi$: $R_\pi=\begin{bmatrix}-1&0\\0&-1\end{bmatrix}$ with eigenvalue $\lambda=-1$ and $E_\lambda=\operatorname{Span}\left\{\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}0\\1\end{bmatrix}\right\}=\mathbb{R}^2$ (every non-zero vector is an eigenvector).
e.g. What if $\theta=\pi/2$? Then $\lambda=\cos(\pi/2)\pm i\sin(\pi/2)=0\pm i(1)=\pm i$, and $A=[R_{\pi/2}]=\begin{bmatrix}0&-1\\1&0\end{bmatrix}$.
For $\lambda=i$: $(A-iI)v=0$ with $A-iI=\begin{bmatrix}-i&-1\\1&-i\end{bmatrix}$. Row-reducing (swap the rows, then add $i$ times the new first row to the second) gives the RREF $\begin{bmatrix}1&-i\\0&0\end{bmatrix}$.
So $x_1-ix_2=0\Rightarrow x_1=ix_2$. Let $x_2=t$. Then $v=t\begin{bmatrix}i\\1\end{bmatrix}$.
The set $M_{2\times 2}(\mathbb{C})$ is defined by $M_{2\times 2}(\mathbb{C})=\left\{A=\begin{bmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{bmatrix}\;\middle|\;a_{11},a_{12},a_{21},a_{22}\in\mathbb{C}\right\}$. An element $A$ of $M_{2\times 2}(\mathbb{C})$ is called a (complex) matrix.
In $M_{2\times 2}(\mathbb{C})$, matrix addition and scalar multiplication follow as expected. Given $A=\begin{bmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{bmatrix}$, $B=\begin{bmatrix}b_{11}&b_{12}\\b_{21}&b_{22}\end{bmatrix}$ and $\alpha\in\mathbb{C}$:
matrix addition: $A+B=\begin{bmatrix}a_{11}+b_{11}&a_{12}+b_{12}\\a_{21}+b_{21}&a_{22}+b_{22}\end{bmatrix}\in M_{2\times 2}(\mathbb{C})$
scalar multiplication: $\alpha A=\begin{bmatrix}\alpha a_{11}&\alpha a_{12}\\\alpha a_{21}&\alpha a_{22}\end{bmatrix}\in M_{2\times 2}(\mathbb{C})$
Matrix-vector and matrix-matrix products for complex matrices and vectors also follow as we did for $\mathbb{R}^2$ and $M_{2\times 2}(\mathbb{R})$.
Theorem 1.6.3
If $x,y,w\in\mathbb{C}^n$ and $c,d\in\mathbb{C}$, then
$x+y\in\mathbb{C}^n$.
$(x+y)+w=x+(y+w)$.
$x+y=y+x$.
There exists a vector $0\in\mathbb{C}^n$ such that $x+0=x$ for all $x\in\mathbb{C}^n$.
For $x\in\mathbb{C}^n$ there exists $(-x)\in\mathbb{C}^n$ such that $x+(-x)=0$.
$cx\in\mathbb{C}^n$.
$c(dx)=(cd)x$.
$(c+d)x=cx+dx$.
$c(x+y)=cx+cy$.
$1x=x$.
e.g. The matrix $A=\begin{bmatrix}1&-1\\1&1\end{bmatrix}$ has no real eigenvalues. We will view this as a matrix in $M_{2\times 2}(\mathbb{C})$ and search for complex eigenvalues and eigenvectors just as we did over $\mathbb{R}$.
(a) Show that the two complex roots of $C(\lambda)$ are $\lambda_1=1+i$ and $\lambda_2=1-i$.
$$C(\lambda)=\det(A-\lambda I)=\begin{vmatrix}1-\lambda&-1\\1&1-\lambda\end{vmatrix}=(1-\lambda)^2-(-1)(1)=(1-\lambda)^2+1.$$
Set $(1-\lambda)^2+1=0\Rightarrow(1-\lambda)^2=-1\Rightarrow 1-\lambda=\pm i$.
So $\lambda=1\mp i$. Thus $\lambda_1=1+i$ and $\lambda_2=1-i$.
(b) Determine all $z\in\mathbb{C}^2$ such that $Az=(1+i)z$.
$(A-(1+i)I)z=0$ with $A-(1+i)I=\begin{bmatrix}1-(1+i)&-1\\1&1-(1+i)\end{bmatrix}=\begin{bmatrix}-i&-1\\1&-i\end{bmatrix}$.
Row-reducing: $\begin{bmatrix}-i&-1\\1&-i\end{bmatrix}\xrightarrow{R_1\leftrightarrow R_2}\begin{bmatrix}1&-i\\-i&-1\end{bmatrix}\xrightarrow{R_2+iR_1}\begin{bmatrix}1&-i\\0&0\end{bmatrix}$.
So $z_1-iz_2=0\Rightarrow z_1=iz_2$. Let $z_2=t$. Then $z=t\begin{bmatrix}i\\1\end{bmatrix}$.
(c) Determine all $w\in\mathbb{C}^2$ such that $Aw=(1-i)w$.
$(A-(1-i)I)w=0$ with $A-(1-i)I=\begin{bmatrix}1-(1-i)&-1\\1&1-(1-i)\end{bmatrix}=\begin{bmatrix}i&-1\\1&i\end{bmatrix}$.
Row-reducing: $\begin{bmatrix}i&-1\\1&i\end{bmatrix}\xrightarrow{R_1\leftrightarrow R_2}\begin{bmatrix}1&i\\i&-1\end{bmatrix}\xrightarrow{R_2-iR_1}\begin{bmatrix}1&i\\0&0\end{bmatrix}$.
So $w_1+iw_2=0\Rightarrow w_1=-iw_2$. Let $w_2=t$. Then $w=t\begin{bmatrix}-i\\1\end{bmatrix}$.
(d) Construct a matrix $P=\begin{bmatrix}z&w\end{bmatrix}$ such that $z$ is a nonzero solution from (b) and $w$ is a nonzero solution from (c).
Take $t=1$ for both: $P=\begin{bmatrix}i&-i\\1&1\end{bmatrix}$.
(e) Determine a matrix $P^{-1}$ such that $PP^{-1}=I=P^{-1}P$.
$\det P=i(1)-(-i)(1)=i+i=2i$.
$$P^{-1}=\frac{1}{2i}\begin{bmatrix}1&i\\-1&i\end{bmatrix}=-\frac{i}{2}\begin{bmatrix}1&i\\-1&i\end{bmatrix}=\begin{bmatrix}-i/2&-i^2/2\\i/2&-i^2/2\end{bmatrix}=\begin{bmatrix}-i/2&1/2\\i/2&1/2\end{bmatrix}.$$
(f) Calculate the matrix $P^{-1}AP$.
This gives $D=\begin{bmatrix}1+i&0\\0&1-i\end{bmatrix}$.
(g) Can you use your work in (a)–(f) to calculate $A^{100}$?
Yes. $A=PDP^{-1}$, so $A^{100}=PD^{100}P^{-1}$, with
$$D^{100}=\begin{bmatrix}(1+i)^{100}&0\\0&(1-i)^{100}\end{bmatrix}.$$
In polar form, $1+i=\sqrt{2}\left(\cos(\pi/4)+i\sin(\pi/4)\right)=\sqrt{2}\,e^{i\pi/4}$, so
$$(1+i)^{100}=(\sqrt{2})^{100}e^{i\cdot 100\pi/4}=2^{50}e^{i\cdot 25\pi}=2^{50}\left(\cos(25\pi)+i\sin(25\pi)\right)=2^{50}(-1)=-2^{50}.$$
Similarly, $1-i=\sqrt{2}\left(\cos(-\pi/4)+i\sin(-\pi/4)\right)=\sqrt{2}\,e^{-i\pi/4}$, so
$$(1-i)^{100}=(\sqrt{2})^{100}e^{-i\cdot 100\pi/4}=2^{50}e^{-i\cdot 25\pi}=2^{50}\left(\cos(-25\pi)+i\sin(-25\pi)\right)=2^{50}(-1)=-2^{50}.$$
So $D^{100}=\begin{bmatrix}-2^{50}&0\\0&-2^{50}\end{bmatrix}=-2^{50}I$ and
$$A^{100}=P(-2^{50}I)P^{-1}=-2^{50}PIP^{-1}=-2^{50}I=\begin{bmatrix}-2^{50}&0\\0&-2^{50}\end{bmatrix}.$$
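Since the entries of $A^{100}$ are only $\pm 2^{50}$, this result fits in 64-bit integers and can be checked directly (a numpy sketch; names mine):

```python
import numpy as np

A = np.array([[1, -1],
              [1, 1]], dtype=np.int64)
# Entries of A^100 are only +-2^50, well within int64 range,
# so integer matrix_power is exact here.
A100 = np.linalg.matrix_power(A, 100)
assert (A100 == -(2**50) * np.eye(2, dtype=np.int64)).all()
```

Note that no complex arithmetic appears here: the complex diagonalization was only the tool used to predict this real (in fact integer) answer.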